

Latent Dirichlet Allocation

#artificialintelligence

Latent Dirichlet Allocation, or LDA for short, is an unsupervised machine learning algorithm. Similar to the clustering algorithm K-means, LDA attempts to group words and documents into a predefined number of clusters (i.e., topics). These topics can then be used to organize and search through documents. One of the most popular methods for estimating an LDA model is Gibbs sampling. Let's walk through one iteration of the algorithm.
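One iteration of collapsed Gibbs sampling can be sketched as follows: for each word occurrence, remove its current topic assignment from the count matrices, sample a new topic from the conditional distribution, and add the new assignment back. The corpus, vocabulary size, and hyperparameter values below are illustrative, not from any particular dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: each document is a list of word ids from a small vocabulary.
docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 4, 2]]
V = 5                    # vocabulary size
K = 2                    # number of topics (fixed in advance, like k in K-means)
alpha, beta = 0.1, 0.01  # Dirichlet hyperparameters (illustrative values)

# Random initial topic assignment for every word occurrence.
z = [[rng.integers(K) for _ in doc] for doc in docs]

# Count matrices maintained by the sampler.
ndk = np.zeros((len(docs), K))  # topic counts per document
nkw = np.zeros((K, V))          # word counts per topic
nk = np.zeros(K)                # total words assigned to each topic
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

# One full Gibbs sweep: resample each word's topic from its conditional
# distribution given all the other current assignments.
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1  # remove current assignment
        p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
        k = rng.choice(K, p=p / p.sum())            # sample a new topic
        z[d][i] = k
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1  # record new assignment
```

Repeating this sweep many times lets the counts converge toward the posterior; topic-word and document-topic distributions are then read off `nkw` and `ndk`.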


AI supported Topic Modeling using KNIME-Workflows

Qundus, Jamal Al, Peikert, Silvio, Paschke, Adrian

arXiv.org Artificial Intelligence

Topic modeling algorithms traditionally model topics as lists of weighted terms. These topic models can be used effectively to classify texts or to support text-mining tasks such as text summarization or fact extraction. The general procedure relies on statistical analysis of term frequencies. The focus of this work is the implementation of knowledge-based topic modeling services in a KNIME workflow. A brief description and evaluation of the DBpedia-based enrichment approach, and a comparative evaluation of the enriched topic models, are outlined based on our previous work. DBpedia Spotlight is used to identify entities in the input text, and information from DBpedia is used to extend these entities. We provide a workflow developed in KNIME implementing this approach and compare topic modeling supported by knowledge-base information against traditional LDA. This topic modeling approach allows semantic interpretation both by algorithms and by humans.


LDA for Text Summarization and Topic Detection - DZone AI

#artificialintelligence

Machine learning clustering techniques are not the only way to extract topics from a text data set. The text mining literature has proposed a number of statistical models, known as probabilistic topic models, to detect topics in an unlabeled set of documents. One of the most popular is the latent Dirichlet allocation (LDA) algorithm developed by Blei, Ng, and Jordan [i]. LDA is a generative unsupervised probabilistic algorithm that isolates the top K topics in a data set, each described by its most relevant N keywords. In other words, the documents in the data set are represented as random mixtures of latent topics, where each topic is characterized by a distribution over a fixed vocabulary, with Dirichlet priors on both the topic mixtures and the topic-word distributions.
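The generative story described above can be sketched directly: draw each topic's word distribution from a Dirichlet, then for each document draw a topic mixture and sample words. All sizes and hyperparameter values below are illustrative choices, not from the cited paper.

```python
import numpy as np

rng = np.random.default_rng(42)

K, V, doc_len = 3, 8, 10
alpha = np.full(K, 0.5)   # Dirichlet prior over a document's topic mixture
eta = np.full(V, 0.1)     # Dirichlet prior over each topic's word distribution

# Each of the K topics is a distribution over the fixed vocabulary of V words.
topics = rng.dirichlet(eta, size=K)        # shape (K, V)

# Generate one document: draw its random topic mixture, then each word.
theta = rng.dirichlet(alpha)               # random mixture of latent topics
zs = rng.choice(K, size=doc_len, p=theta)  # latent topic for each word position
words = [rng.choice(V, p=topics[z]) for z in zs]
```

Inference (e.g., Gibbs sampling or variational Bayes) runs this story in reverse: given only `words` across many documents, it recovers estimates of `topics` and each document's `theta`.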


Fast Online Incremental Learning on Mixture Streaming Data

Wang, Yi (Dalian University of Technology) | Fan, Xin (Dalian University of Technology) | Luo, Zhongxuan (Dalian University of Technology) | Wang, Tianzhu (No. 254, Deta Leisure Town, Jinzhou New District, Dalian) | Min, Maomao (Dalian University of Technology) | Luo, Jiebo (University of Rochester)

AAAI Conferences

The explosion of streaming data poses challenges to feature learning methods, including linear discriminant analysis (LDA). Many existing LDA algorithms are not efficient enough to update incrementally with samples that arrive sequentially in various manners. First, we propose a new fast batch LDA learning algorithm (FLDA/QR) that uses the cluster centers to solve a lower triangular system, optimized by Cholesky factorization. To take advantage of the intrinsically incremental mechanism of the matrix factorization, we further develop an exact incremental algorithm (IFLDA/QR). The Gram-Schmidt process with reorthogonalization in IFLDA/QR significantly reduces space and time costs compared with the rank-one QR updating of most existing methods. IFLDA/QR can handle streaming data containing 1) new labeled samples in existing classes, 2) samples of an entirely new (novel) class, and, more significantly, 3) a chunk of examples mixing 1) and 2). Both theoretical analysis and numerical experiments demonstrate much lower space and time costs (2-10 times faster) than the state of the art, with comparable classification accuracy.
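The paper's FLDA/QR and IFLDA/QR algorithms are not reproduced here, but the underlying idea of solving for the discriminant through triangular systems from a Cholesky factorization, rather than an explicit matrix inverse, can be illustrated with a minimal two-class Fisher discriminant on made-up data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy Gaussian classes in 2-D.
X0 = rng.normal([0, 0], 0.3, size=(50, 2))
X1 = rng.normal([2, 1], 0.3, size=(50, 2))

m0, m1 = X0.mean(axis=0), X1.mean(axis=0)  # class centers
# Within-class scatter matrix (sum of per-class scatter).
Sw = np.cov(X0, rowvar=False) * 49 + np.cov(X1, rowvar=False) * 49

# Solve Sw w = (m1 - m0) via Cholesky: two triangular solves instead of
# inverting Sw (np.linalg.solve is used here for simplicity; a dedicated
# triangular solver would exploit the structure of L).
L = np.linalg.cholesky(Sw)
y = np.linalg.solve(L, m1 - m0)  # forward solve with lower triangular L
w = np.linalg.solve(L.T, y)      # back solve with upper triangular L.T
w /= np.linalg.norm(w)

# Project onto w and classify against the midpoint of the projected means.
threshold = ((X0 @ w).mean() + (X1 @ w).mean()) / 2
acc = (np.mean(X0 @ w < threshold) + np.mean(X1 @ w > threshold)) / 2
```

The incremental algorithms in the paper go further by updating such factorizations in place as new samples or new classes stream in, avoiding recomputation from scratch.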


Towards Big Topic Modeling

Yan, Jian-Feng, Zeng, Jia, Liu, Zhi-Qiang, Gao, Yang

arXiv.org Machine Learning

To solve the big topic modeling problem, we need to reduce both the time and space complexities of batch latent Dirichlet allocation (LDA) algorithms. Although parallel LDA algorithms on multi-processor architectures have low time and space complexities, their communication costs among processors often scale linearly with the vocabulary size and the number of topics, leading to a serious scalability problem. To reduce the communication complexity among processors for better scalability, we propose a novel communication-efficient parallel topic modeling architecture based on power laws, which consumes orders of magnitude less communication time when the number of topics is large. We combine the proposed communication-efficient parallel architecture with the online belief propagation (OBP) algorithm, referred to as POBP, for big topic modeling tasks. Extensive empirical results confirm that POBP has the following advantages for big topic modeling: 1) high accuracy, 2) communication efficiency, 3) fast speed, and 4) constant memory usage, when compared with recent state-of-the-art parallel LDA algorithms on multi-processor architectures.